1.  Introduction

The estimation problem in essence is the following. We have an
observed process $y(t)$ (an n x 1 matrix function) which has the form

$$ y(t) = S(\theta, t) + N(t), \qquad 0 < t < T \tag{1.1} $$

where $\theta$ denotes a vector of unknown parameters which we want to estimate,
$S(\theta, t)$ being a stochastic process ("signal") which is completely specified
once $\theta$ is specified (by means of a stochastic differential system, for
example), and $N(t)$ is a stochastic process which models the errors (those that
remain even after all 'systematic' errors, such as bias and calibration
errors, have been accounted for). There is much evidence to suggest that
the noise process may be well modelled as Gaussian, and independent of
the signal process. This is a basic assumption throughout this paper.

Under  the  title  of  "System  Identification"  there  is  a large 
engineering  literature  dealing  with  such  problems.  This  is  well  documented 
in  the  proceedings  of  three  symposia  [1]  devoted  exclusively  thereto. 

In the bulk of this literature, the process $S(\theta, t)$ is taken to be
deterministic, in which case the estimation is largely treated as a
'Least Squares' problem of minimising

$$ \int_0^T \| y(t) - S(\theta, t) \|^2 \, dt $$

over a predetermined admissible set of parameters $\theta$. Where the stochastic
signal case is considered, it is reduced to the time-discrete version
of (1.1):

$$ y_n = S_n(\theta) + N_n \tag{1.2} $$

for the reason that the continuous-time case is mathematically too difficult
to handle and that, in digital computer processing (as is the rule),
the data are discretized in the A-D conversion process anyway. This is
indeed true; but the authors invariably proceed to make the assumption
that the noise samples $\{N_n\}$ are mutually independent. But this requires
that the sampling rate (in the periodic sampling of the data) be not more
than twice the postulated 'bandwidth' of the noise, itself actually unknown.
Indeed in most practical cases the sampling rate is far higher than twice
the bandwidth. To meet this objection, one may then allow the $\{N_n\}$ to be
correlated. But then the correlation function must be known, and anyone
with experience in handling real data can easily appreciate that it is
unrealistic to require that much knowledge of the noise process, even if
the complication in the theory can be borne.
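As a rough numerical illustration of the sampling-rate objection, the following Python sketch assumes an idealized noise with flat spectrum on [-B, B]. This is an assumption made here purely for illustration, since the actual noise spectrum is unknown. For such noise the normalized correlation at lag $\tau$ is $\sin(2\pi B\tau)/(2\pi B\tau)$, so samples taken at exactly twice the bandwidth are uncorrelated, while samples taken at a higher rate are strongly correlated. The function names are hypothetical.

```python
import math


def bandlimited_autocorr(tau: float, bandwidth_hz: float) -> float:
    """Normalized autocorrelation at lag tau of ideal low-pass noise
    (flat spectrum on [-B, B]); an illustrative assumption only."""
    x = 2.0 * bandwidth_hz * tau
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def adjacent_sample_correlation(sample_rate_hz: float, bandwidth_hz: float) -> float:
    """Correlation between two successive samples taken at the given rate."""
    return bandlimited_autocorr(1.0 / sample_rate_hz, bandwidth_hz)


if __name__ == "__main__":
    B = 10.0  # hypothetical noise bandwidth in Hz
    for factor in (2, 4, 10, 50):
        rho = adjacent_sample_correlation(factor * B, B)
        print(f"rate = {factor:2d} x bandwidth: adjacent-sample correlation = {rho:.4f}")
```

The faster the sampling relative to the bandwidth, the closer the adjacent-sample correlation is to one, which is why the independence assumption fails at high sampling rates.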


We maintain, in any event, that it is much better to work with the
time-continuous model (1.1), allowing as high a sampling rate in the
processing as the A-D converter is designed for. But in the time-continuous
model we are faced with another problem. The basic tool in
estimation is the likelihood functional (for fixed parameters), which is
based on the Radon-Nikodym derivative of the probability measure induced
by the process $y(\cdot)$ with respect to that induced by the noise process $N(\cdot)$. But
this derivative is too difficult to calculate even when the precise
spectrum of $N(\cdot)$ is known, which it is not. What we can assert for
sure is that the bandwidth of the noise $N(t)$ is much larger than that
of the process $S(\theta, t)$, which is essential in order that the measuring
instrument does not 'distort' the signal. At this point it was customary
in the earlier engineering literature to introduce "white noise" in a
formal way as a stationary stochastic process with constant spectral
density to represent the 'large bandwidth' nature of $N(t)$. With the
advances in the theory of diffusion processes using the Ito integral,
it became fashionable to use a Wiener process model as being "more
rigorous" [2]. Thus we replace (1.1) by

$$ Y(t) = \int_0^t S(\theta, \sigma)\, d\sigma + W(t) \tag{1.3} $$


where $W(t)$ is a Wiener process. We can then exploit the well-developed
machinery of martingales and Ito integrals. In fact the likelihood
function can then be expressed as (see [2]):

$$ \exp\left( -\tfrac{1}{2} \left\{ \int_0^T \|\hat S(\theta,t)\|^2 \, dt - 2 \int_0^T [\hat S(\theta,t), dY(t)] \right\} \right) \tag{1.4} $$

where $\hat S(\theta,t)$ is the best mean square estimate of $S(\theta,t)$ given the
sigma-algebra generated by $Y(s)$, $s \le t$. This formula can be justly considered
as one of the triumphs of the Ito theory, the key to the success being
the appearance of the Ito integral in the second term of (1.4). This
integral is defined on the basis that $Y(t)$ is of unbounded variation
with probability one. Of course no physical instrument can produce such
a waveform. To calculate it, given the actual observation (1.1), we can
"retrace" our steps back from (1.3) and use

$$ y(t)\,dt $$

in place of $dY(t)$. But this is totally incorrect, unless $S(\theta,t)$ is
deterministic, and any minimisation procedure based on it leads to
erroneous results. This point is not appreciated by authors using (1.3)
as "more rigorous", perhaps because they have not had occasion to actually
calculate anything based on real data. In any data generated by
digital computer simulation, which must perforce employ the discrete
version (1.2), this point can be completely masked and hence never
appreciated.
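The nature of the discrepancy can be seen in the simplest scalar analogue, in which the integrand is the Wiener path itself. The Python sketch below is an illustration chosen here, not the signal model of this paper. It compares the Ito sum for the integral of $W\,dW$ with the midpoint sum that ordinary calculus on a smoothed path would produce. The two differ by half the sum of squared increments, which is close to $T/2$ and does not vanish as the step size decreases. All function names are hypothetical.

```python
import math
import random


def simulate_wiener_increments(n_steps: int, horizon: float, seed: int = 0) -> list:
    """Independent Gaussian increments of a Wiener path on [0, horizon]."""
    rng = random.Random(seed)
    sigma = math.sqrt(horizon / n_steps)
    return [rng.gauss(0.0, sigma) for _ in range(n_steps)]


def ito_and_smooth_sums(increments: list) -> tuple:
    """Return (Ito sum, midpoint sum) approximating the integral of W dW.

    The Ito sum evaluates the integrand at the left end of each step; the
    midpoint sum is what ordinary calculus on a smoothed path would give.
    """
    w = 0.0
    ito = 0.0
    smooth = 0.0
    for dw in increments:
        ito += w * dw
        smooth += (w + 0.5 * dw) * dw
        w += dw
    return ito, smooth


if __name__ == "__main__":
    dws = simulate_wiener_increments(20000, 1.0, seed=1)
    ito, smooth = ito_and_smooth_sums(dws)
    print("Ito sum:      ", ito)
    print("midpoint sum: ", smooth)
    print("difference:   ", smooth - ito, "(close to T/2 = 0.5)")
```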

Faced with this difficulty we have to examine the model again more
precisely, to find a physically more meaningful way of exploiting the
fact that the noise bandwidth is large compared to the signal bandwidth.
What is needed is the 'asymptotic form' of the likelihood functional as
the bandwidth expands to infinity in an arbitrary manner.

Such a theory has been developed by the author using a precise notion
of white noise. This is explained in Section 2. Based on this theory
we derive a likelihood functional in Section 3. It turns out that
formula (1.4) is replaced by

$$ \exp\left( -\tfrac{1}{2} \left\{ \int_0^T \|\hat S(\theta,t)\|^2 \, dt - 2 \int_0^T [\hat S(\theta,t), y(t)] \, dt + \int_0^T \left( \widehat{\|S(\theta,t)\|^2} - \|\hat S(\theta,t)\|^2 \right) dt \right\} \right) \tag{1.5} $$

where $\hat{\ }$ denotes conditional expectation given the data up to time t.

Note that a third term appears, which can also be expressed as:

$$ \int_0^T \widehat{\|S(\theta,t) - \hat S(\theta,t)\|^2} \, dt $$

and in the case where $S(\theta,t)$ is Gaussian, this reduces to

$$ \int_0^T E\left[ \|S(\theta,t) - \hat S(\theta,t)\|^2 \right] dt $$

being thus the integral of the mean square error in estimation of the
signal $S(\theta,t)$ from the observation up to time t. When the signal process
can be described in terms of stochastic differential equations, whether
finite or infinite dimensional, advantage can be taken of the fact that
the mean square error can be evaluated by solving a Riccati equation.
Section 4 is devoted to this specialization. Section 5 deals with the
application to the problem of estimating stability and control derivatives from
flight test data, taking turbulence into account. The algorithms used,
and results obtained on actual flight data, are included.
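To indicate how the third term might be evaluated in the Gaussian case, the following Python sketch integrates a scalar Riccati equation. The model is an assumption made here for illustration only: a scalar signal $dx = a\,x\,dt + \sqrt{q}\,d\beta$ observed in white noise of unit intensity, for which the error variance satisfies $dP/dt = 2aP + q - P^2$. The third term is then the integral of $P(t)$ over $[0, T]$. The names `riccati_rhs` and `integrated_mean_square_error` are hypothetical.

```python
import math


def riccati_rhs(p: float, a: float, q: float) -> float:
    """Right-hand side of the scalar filtering Riccati equation
    dP/dt = 2 a P + q - P^2 (unit observation-noise intensity assumed)."""
    return 2.0 * a * p + q - p * p


def integrated_mean_square_error(a: float, q: float, p0: float,
                                 horizon: float, n_steps: int = 10000) -> tuple:
    """Integrate P(t) with classical RK4; return (P(T), integral of P over [0, T])."""
    h = horizon / n_steps
    p = p0
    integral = 0.0
    for _ in range(n_steps):
        k1 = riccati_rhs(p, a, q)
        k2 = riccati_rhs(p + 0.5 * h * k1, a, q)
        k3 = riccati_rhs(p + 0.5 * h * k2, a, q)
        k4 = riccati_rhs(p + h * k3, a, q)
        p_next = p + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        integral += 0.5 * h * (p + p_next)  # trapezoidal rule for the integral of P
        p = p_next
    return p, integral


if __name__ == "__main__":
    p_T, third_term = integrated_mean_square_error(a=-1.0, q=3.0, p0=0.0, horizon=20.0)
    print("P(T) =", p_T, " steady state =", -1.0 + math.sqrt(1.0 + 3.0))
    print("integral of P over [0, T] =", third_term)
```

For these illustrative values the variance settles at $a + \sqrt{a^2 + q} = 1$, and the integral can be checked against the closed form $T + \ln\big((3 + e^{-4T})/4\big)$.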


2.  WHITE NOISE: BASIC NOTIONS

Let H denote a real, separable Hilbert space and let

$$ W = L_2[0, T; H], \qquad 0 < T < \infty $$

denote the real Hilbert space of H-valued weakly measurable functions
$u(\cdot)$ such that

$$ \int_0^T [u(t), u(t)] \, dt < \infty $$

with inner product defined by

$$ [u, v] = \int_0^T [u(t), v(t)] \, dt. $$

Let $\mu_G$ denote Gauss measure on W (on the cylinder sets with finite dimensional
Borel basis) with characteristic function

$$ C_G(h) = \exp\left( -\tfrac{1}{2} [h, h] \right), \qquad h \in W. $$

Elements of W under this (finitely additive) measure will be 'white noise
sample functions', denoted $\omega$. This terminology appears to have the sanction
of usage; see Skorokhod [3] for example. It is essential for us that W is
an $L_2$-space over a finite interval.
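The definition can be made concrete on a cylinder set basis: along any finite orthonormal set $\{\phi_i\}$ the coordinates $[\omega, \phi_i]$ are independent zero-mean, unit-variance Gaussians. The Python sketch below, with hypothetical function names, estimates the characteristic function by simulation for an $h$ lying in the span of three such $\phi_i$ and compares it with $\exp(-\tfrac12 [h,h])$. It is only a finite-dimensional check and says nothing about countable additivity.

```python
import cmath
import math
import random


def sample_white_noise_coordinates(dim: int, rng: random.Random) -> list:
    """Coordinates [w, phi_i], i = 1..dim, of a white noise sample along an
    orthonormal set: independent zero-mean, unit-variance Gaussians."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def empirical_characteristic_function(h_coords: list, n_samples: int = 50000,
                                      seed: int = 0) -> complex:
    """Monte Carlo estimate of E exp(i [h, w]) for h in the span of the phi_i."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_samples):
        w = sample_white_noise_coordinates(len(h_coords), rng)
        inner = sum(hc * wc for hc, wc in zip(h_coords, w))
        total += cmath.exp(1j * inner)
    return total / n_samples


def gauss_characteristic_function(h_coords: list) -> float:
    """Closed form exp(-1/2 [h, h])."""
    return math.exp(-0.5 * sum(hc * hc for hc in h_coords))


if __name__ == "__main__":
    h = [0.6, -0.8, 0.5]  # hypothetical coordinates of h
    print("simulated:  ", empirical_characteristic_function(h))
    print("closed form:", gauss_characteristic_function(h))
```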

Any function $f(\cdot)$ defined on W into another Hilbert space $H_r$ such that
the inverse images of Borel sets in $H_r$ are cylinder sets will be called a
'tame' function. See Gross [4]. As is readily seen, the class of tame
functions is a linear class. Since the inverse image of the whole space $H_r$
must be cylindrical, it is clear that any tame function has the form $f(P\omega)$
where P is a finite dimensional projection.

To introduce the notion of a 'random variable' let us first confine
ourselves to the case where $H_r$ is finite dimensional: $H_r = R^m$ say. We
introduce a metric into the linear space of tame functions by:

$$ |||f - g||| = \int_W \frac{\|f - g\|}{1 + \|f - g\|} \, d\mu_G $$

and then complete the space, the completion yielding a Frechet space. Every
element of the completed space is called a 'random variable', and if $\zeta$ denotes
such an element and $f_n(\omega)$ a corresponding Cauchy sequence in probability,
then we define the corresponding 'distribution function' or probability
measure on $H_r$ to be that induced by the characteristic function

$$ \lim_n \int_W \exp\left( i [f_n(\omega), h] \right) d\mu_G, \qquad h \in H_r. \tag{2.0} $$

The latter limit exists (uniformly on bounded sets of $R^m = H_r$).

In the case where $H_r$ is no longer finite dimensional, we shall still
identify Cauchy sequences in probability of tame functions as "weak random
variables". The limit in (2.0) still holds, uniformly on bounded sets in
$H_r$, but the limit may in general only define a "weak distribution" on $H_r$.
We recall in this connection the Sazonov theorem [5] that the limit is the
characteristic function of a probability measure if and only if it is
continuous in the trace-norm topology ('S-topology', see below). This is
automatically the case if the sequence is Cauchy in the mean square sense,
and we shall then drop the qualification "weak".

Let $f(\omega)$ be any Borel measurable function mapping W into $H_r$. Then
$f(P\omega)$ is tame for every finite-dimensional projection operator P. Let $\{P_m\}$
denote a sequence of finite dimensional projections converging strongly to
the identity; the sequence may be assumed to be monotone. If the sequence
$f(P_m\omega)$ is Cauchy in probability, then we may associate a (weak, in general)
random variable with $f(\cdot)$; let us denote it by $\tilde f$ (a notation used by Gross).
This limit of course can depend on the particular projection sequence chosen.
Of primary interest to us are those functions $f(\cdot)$ for which $\{f(P_m\omega)\}$ is
Cauchy in probability for every such sequence of finite dimensional projections
and moreover such that all such Cauchy sequences are equivalent, so that the
limit random variable $\tilde f$ is unique. In that case we say that $f(\omega)$ is a (weak)
random variable. We shall use the term "random variable" if the corresponding
measure is countably additive; we shall be dealing in the sequel only with


ttv'  ( '<  v i\;m'  vt"i  nv. . 


The simplest function one can consider is perhaps the linear
function:

$$ f(\omega) = L\omega $$

where L is a bounded linear transformation mapping W into $H_r$, where we now
allow $H_r$ to be infinite dimensional. Then it is easy to see that if L is
Hilbert-Schmidt, then $\{L P_m \omega\}$ is Cauchy in the mean square sense, and $L\omega$
is a random variable. Conversely, L must be H.S. if $L\omega$ is to be a random
variable.
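The role of the Hilbert-Schmidt condition can be seen in a simple case. The Python sketch below assumes, for illustration only, an operator that is diagonal in the basis $\{\phi_k\}$, $L\phi_k = s_k\phi_k$, with $P_m$ the projection on the first m basis vectors. Then $E\|L P_m \omega\|^2 = \sum_{k \le m} s_k^2$, which stays bounded in m exactly when L is Hilbert-Schmidt. The function names are hypothetical.

```python
import math


def mean_square_norm(singular_values) -> float:
    """E ||L P_m w||^2 for an operator L that is diagonal in the basis {phi_k},
    L phi_k = s_k phi_k, with P_m the projection on the first m basis vectors.
    Since the coordinates of w are independent N(0, 1), this equals sum s_k^2."""
    return sum(s * s for s in singular_values)


def diagonal_example(m: int) -> float:
    """Hilbert-Schmidt example: s_k = 1/k."""
    return mean_square_norm(1.0 / k for k in range(1, m + 1))


def identity_example(m: int) -> float:
    """Non-Hilbert-Schmidt example: the identity, s_k = 1."""
    return mean_square_norm(1.0 for _ in range(m))


if __name__ == "__main__":
    for m in (10, 1000, 100000):
        print(m, diagonal_example(m), identity_example(m))
    print("limit for s_k = 1/k:", math.pi ** 2 / 6.0)
```

For $s_k = 1/k$ the mean square norm converges (to $\pi^2/6$), whereas for the identity it grows like m, in line with the fact that white noise itself is not a random variable with values in W.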
What is the class of functions which are random variables? To answer
this question, at least in part, let us introduce the S-topology on W: this
is the (locally convex) topology induced by seminorms of the form:

$$ p(\omega) = \sqrt{[S\omega, \omega]} \tag{2.1} $$

where here (and hereinafter) S will denote a self-adjoint, non-negative
definite trace-class operator on W into W. For the case where $H_r = R^1$,
Gross [4] has given a sufficient condition: $f(\cdot)$ is a random variable if
it is uniformly continuous in the S-topology. Uniform continuity means that
given $\epsilon > 0$, we can find $p(\cdot)$ such that

$$ \|f(x) - f(y)\| < \epsilon \quad \text{for all } x, y \text{ such that } p(x - y) < 1. $$

Unfortunately Gross does not seem to discuss non-trivial examples of functions
satisfying this condition. Here we shall give a sufficient condition for a
class of random variables with finite second moment.

Theorem 2.1

Let $p_n(\omega)$ denote a homogeneous polynomial of degree n mapping W into $H_r$.
Suppose it is continuous at the origin in the S-topology. Let P denote any
finite dimensional projection. Then

$$ \sup_P E\left( \|p_n(P\omega)\|^2 \right) < \infty \tag{2.2} $$

where the supremum is taken over the class of all finite dimensional projections.
Conversely, if (2.2) holds, then $p_n(\cdot)$ is continuous at the origin in the
S-topology.

Proof. We begin with a simple but useful Lemma.

Lemma 2.1

Suppose $p_n(\cdot)$ is continuous in the S-topology at the origin. Then there
exists a seminorm in the S-topology:

$$ p(\omega) = \sqrt{[S\omega, \omega]} \tag{2.3} $$


If $p(\omega) = 0$, then for any positive number k,


The $\{\zeta_i\}$ is a sequence of independent zero-mean, unit-variance Gaussians and
(2.6) defines a tame function. Moreover we can readily calculate (by expressing
(2.6) in terms of Hermite polynomials for instance) that

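$$ E\left( \|p_n(P_m\omega)\|^2 \right) = \sum_{\nu=0}^{[n/2]} \frac{(n!)^2}{(n-2\nu)! \, 2^{2\nu} \, (\nu!)^2} \sum_{i_{2\nu+1}=1}^{m} \cdots \sum_{i_n=1}^{m} \Big\| \sum_{i_1=1}^{m} \cdots \sum_{i_\nu=1}^{m} k_n(\phi_{i_1}, \phi_{i_1}, \ldots, \phi_{i_\nu}, \phi_{i_\nu}, \phi_{i_{2\nu+1}}, \ldots, \phi_{i_n}) \Big\|^2 \tag{2.7} $$

where $k_n(\cdot, \ldots, \cdot)$ denotes the symmetric n-linear form associated with $p_n(\cdot)$.
But from Lemma 2.1, we have that

$$ \|p_n(P_m\omega)\|^2 \le [S_m\omega, \omega]^n \tag{2.8} $$

where

$$ S_m = P_m S P_m $$

and is of course trace-class and finite dimensional.
Hence

$$ E\left[ \|p_n(P_m\omega)\|^2 \right] \le E\left( [S_m\omega, \omega]^n \right) \tag{2.9} $$

Let $\phi_k$, $k = 1, \ldots, \nu$, be the orthonormalized eigen-vectors of $S_m$ with corresponding
non-zero eigen-values $\lambda_k$.
Then

$$ [S_m\omega, \omega] = \sum_{k=1}^{\nu} \lambda_k [\phi_k, \omega]^2 $$

and we have

$$ E\left( [S_m\omega, \omega]^n \right) = f(\mathrm{Tr}\, S_m, \mathrm{Tr}\, S_m^2, \ldots, \mathrm{Tr}\, S_m^n) $$

where $f(\cdots)$ is a fixed continuous function. Of course

$$ \mathrm{Tr}\, S_m^j $$

is monotone in m for each j and converges to

$$ \mathrm{Tr}\, S^j. $$

Hence it follows that

$$ E\left[ \|p_n(P\omega)\|^2 \right] < \infty $$

for all finite dimensional projections.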

To prove the converse, suppose (2.2) holds. Then (2.7) holds for every m, and
taking $\nu = 0$ therein, we obtain that

$$ \sum_{i_1=1}^{\infty} \cdots \sum_{i_n=1}^{\infty} \|k_n(\phi_{i_1}, \ldots, \phi_{i_n})\|^2 < \infty \tag{2.10} $$

for every orthonormal sequence $\{\phi_i\}$. Hence $k_n(\cdot)$ is Hilbert-Schmidt. Of
course

$$ \|p_n(\omega)\|^2 \le M \|\omega\|^{2n} \tag{2.11} $$

Define now S by:

$$ [S\omega, \omega] = \left( \|p_n(\omega)\|^2 \right)^{1/n} $$

Then S is Hilbert-Schmidt by (2.10).

For any finite dimensional projection P,

$$ E[SP\omega, P\omega] = E[PSP\omega, \omega] = E\left( (\|p_n(P\omega)\|^2)^{1/n} \right) $$

and hence

$$ \sup_P E[PSP\omega, \omega] < \infty $$

But taking the orthonormal basis of eigen-vectors of S, it follows that S
is trace-class.

It follows from Theorem 2.1 that if a homogeneous polynomial is
uniformly continuous in the S-topology, the corresponding random variable
has finite second moment.

For a homogeneous polynomial of degree 2 with range in $H_r$ ($H_r = R^1$ or
$R^m$ more generally) we can prove that continuity at the origin in the S-topology
is sufficient to make it a random variable. For from (2.7) we have

$$ E\left[ \|p_2(P_m\omega)\|^2 \right] = 2 \sum_{i=1}^{m} \sum_{j=1}^{m} \|k_2(\phi_i, \phi_j)\|^2 + \Big\| \sum_{i=1}^{m} k_2(\phi_i, \phi_i) \Big\|^2 < \infty $$

and hence

$$ \sum_{i=1}^{\infty} \|k_2(\phi_i, \phi_i)\| < \infty $$

for any orthonormal system. Hence it follows that

$$ E\left[ \|p_2(P_n\omega) - p_2(P_m\omega)\|^2 \right] $$

tends to zero as m, n increase; that is, the sequence $\{p_2(P_m\omega)\}$ is Cauchy in
the mean square sense. This suffices for our purposes here. See [6] for more, and in
particular the relation to multiple Ito integrals.
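The degree-2 moment formula can be checked numerically in the simplest setting. The Python sketch below assumes a real-valued quadratic form $p_2(\omega) = \sum_{i,j} k_{ij} \zeta_i \zeta_j$ on a two-dimensional range of P, with a symmetric matrix of hypothetical coefficients $k_{ij}$, and compares a simulated second moment with $2\sum_{i,j} k_{ij}^2 + (\sum_i k_{ii})^2$. It is a sanity check under these assumptions, not a proof.

```python
import random


def second_moment_closed_form(k: list) -> float:
    """2 * sum_ij k_ij^2 + (sum_i k_ii)^2 for a symmetric matrix k (list of rows)."""
    n = len(k)
    frob = sum(k[i][j] ** 2 for i in range(n) for j in range(n))
    trace = sum(k[i][i] for i in range(n))
    return 2.0 * frob + trace ** 2


def second_moment_monte_carlo(k: list, n_samples: int = 100000, seed: int = 0) -> float:
    """Simulated E[(sum_ij k_ij z_i z_j)^2] with z_i independent N(0, 1)."""
    rng = random.Random(seed)
    n = len(k)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        q = sum(k[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
        total += q * q
    return total / n_samples


if __name__ == "__main__":
    K = [[1.0, 0.5], [0.5, -0.3]]  # hypothetical symmetric coefficients
    print("closed form:", second_moment_closed_form(K))
    print("simulated:  ", second_moment_monte_carlo(K))
```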


3.  RADON-NIKODYM DERIVATIVES OF WEAK DISTRIBUTIONS


Let $\omega$ denote white noise samples as in Section 2 and let

$$ y(\omega) = f(\omega) + \omega \tag{3.1} $$

where $f(\cdot)$ is a random variable mapping into W; then $\{y(P_m\omega)\}$ is a
Cauchy sequence in probability (being the sum of two such sequences) and
the limit is independent of the particular sequence $\{P_m\}$ chosen. Hence $y(\omega)$
induces a weak distribution on W. Call it $\mu_y$. As finitely additive
measures, $\mu_y$ is said to be absolutely continuous with respect to $\mu_G$ if
given any $\epsilon > 0$, we can find $\delta > 0$ such that for any cylinder set C,

$$ \mu_y(C) < \epsilon $$

as soon as

$$ \mu_G(C) < \delta. $$

The definition of the derivative however is more involved. For our purposes,
we shall be concerned with the case where the derivative is a random variable.
That is to say, there exists a function $f(\omega)$ mapping W into $R^1$ such that
$f(\omega)$ is a random variable and for any cylinder set C:

$$ \mu_y(C) = \lim_m \int_C f(P_m\omega) \, d\mu_G $$

where $\{P_m\}$ is any monotone sequence of finite dimensional projections
converging strongly to the identity.

Let

$$ W_s = L_2[(0, T); H_s] $$

where $H_s$ is a separable Hilbert space. Let $\mu_{sG}$ denote Gauss measure thereon,
and let $\omega_s$ denote points in $W_s$. (The subscript s stands for 'signal'.) Let

$$ W_2 = W_s \otimes W $$

the Cartesian product, and induce the product Gauss measure $\mu_2$ on $W_2$:

$$ \mu_2(C_s \otimes C) = \mu_{sG}(C_s) \, \mu_G(C) $$

where $C_s$ is a cylinder set in $W_s$ and C a cylinder set in W. Denote points
in $W_2$ by:

$$ \omega_2 = \begin{bmatrix} \omega_s \\ \omega \end{bmatrix} $$

Define

$$ y(\omega_2) = f(\omega_s) + \omega \tag{3.2} $$

where $f(\cdot)$ is a random variable mapping $W_s$ into W. Let $\mu_y$ denote again the
(finitely additive) measure induced by $y(\cdot)$. We wish to prove the
absolute continuity of the measure $\mu_y(\cdot)$ with respect to the
measure $\mu_G(\cdot)$ and to find the corresponding derivative.

For the Wiener process version of (3.2), such a result appears to have
been first developed by Duncan [7] for the case where $f(\omega_s)$ is a diffusion
process. See also [8]. As may be expected, our result has a superficial similarity to
the Stratonovich version [9, eq. 12].

Let H be finite dimensional: $H = R^n$.

Theorem 3.1  Let $f(\omega_s)$ denote a random variable mapping $W_s$ into W such that

$$ E\left( \|f(\omega_s)\|^2 \right) < \infty \tag{3.3} $$

Let $y(\omega_2)$ be defined by (3.2).

Then $\mu_y$ is absolutely continuous with respect to $\mu_G$ and the derivative
is a random variable (white noise integral), corresponding to the function
$g(\omega)$ defined by:

$$ g(\omega) = \int_W \exp\left( -\tfrac{1}{2} \left\{ \|x\|^2 - 2 [x, \omega] \right\} \right) d\mu_s \tag{3.4} $$

where x is a dummy variable denoting points in W, and $\mu_s(\cdot)$ is the countably
additive measure induced by $f(\cdot)$ on the Borel sets of W. More precisely:

$$ \lim_m E\left( \exp\, i [f(P_m\omega_s), h] \right) = C(h) = \int_W e^{i[x, h]} \, d\mu_s $$

where $P_m$ is any monotone sequence of finite dimensional projections converging
strongly to the identity.
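A one-dimensional analogue gives a quick consistency check on formula (3.4). The Python sketch below assumes, purely for illustration, a scalar signal distributed as $N(0, \ell^2)$ and unit-variance Gaussian noise, so that $y$ is $N(0, 1 + \ell^2)$. It evaluates the integral in (3.4) by the trapezoidal rule and compares it with the ratio of the two Gaussian densities. The function names and the value of $\ell$ are hypothetical.

```python
import math


def derivative_by_formula(w: float, ell: float, n_grid: int = 20001,
                          span: float = 12.0) -> float:
    """Evaluate g(w) = integral of exp(-1/2 (x^2 - 2 x w)) d mu_s(x) by the
    trapezoidal rule, with mu_s the N(0, ell^2) law of the scalar signal."""
    lo, hi = -span * ell, span * ell
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * h
        density = math.exp(-0.5 * (x / ell) ** 2) / (ell * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * math.exp(-0.5 * (x * x - 2.0 * x * w)) * density
    return total * h


def derivative_by_density_ratio(w: float, ell: float) -> float:
    """Ratio of the N(0, 1 + ell^2) density of y = signal + noise to the
    N(0, 1) density of the noise alone."""
    var_y = 1.0 + ell * ell
    return math.exp(0.5 * w * w * ell * ell / var_y) / math.sqrt(var_y)


if __name__ == "__main__":
    for w in (-2.0, 0.0, 1.5):
        print(w, derivative_by_formula(w, 0.7), derivative_by_density_ratio(w, 0.7))
```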

Proof

With $\mu_s$ denoting the (countably additive) measure induced by $f(\omega_s)$ on W,
define for each $\omega$:

$$ g(\omega) = \int_W \exp\left( -\tfrac{1}{2} \left\{ \|x\|^2 - 2 [x, \omega] \right\} \right) d\mu_s $$

This is well defined since the integrand is continuous in x, non-negative,
and bounded by

$$ \exp\left( \tfrac{1}{2} \|\omega\|^2 \right). $$

Moreover $g(\omega)$ is actually a continuous functional on W. For, given $\epsilon > 0$,
we can find a closed bounded set $K_\epsilon$ such that

$$ \mu_s(K_\epsilon) > 1 - \epsilon. $$

Then

$$ \int_{K_\epsilon} \exp\left( -\tfrac{1}{2} \left\{ \|x\|^2 - 2 [x, \omega] \right\} \right) d\mu_s $$

is continuous in $\omega$, and on the complement of $K_\epsilon$ the integral is

$$ \le \left( \exp \tfrac{1}{2} \|\omega\|^2 \right) \epsilon. $$

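Now let us show that $g(\omega)$ is a random variable. Let $\{P_m\}$ denote a monotone
sequence of finite dimensional projections on W strongly convergent to the
identity. Let $\{\phi_i\}$ denote a corresponding orthonormal basis, with the
range of $P_m$ being the span of the first m members of the sequence. Let us
note that we can write

$$ g(P_m\omega) = \int_W \exp\left( -\tfrac{1}{2} \|x - P_m x\|^2 - \tfrac{1}{2} \left\{ \|P_m x\|^2 - 2 [P_m x, \omega] \right\} \right) d\mu_s $$

and hence is

$$ \le \int_W \exp\left( -\tfrac{1}{2} \left\{ \|P_m x\|^2 - 2 [P_m x, \omega] \right\} \right) d\mu_s. $$

Let

$$ g_m(\omega) = \int_W \exp\left( -\tfrac{1}{2} \left\{ \|P_m x\|^2 - 2 [P_m x, \omega] \right\} \right) d\mu_s = g_m(P_m\omega) $$

Then

$$ \int_W g_m(\omega) \, d\mu_G = \int_W \left( \int_W \exp\left( -\tfrac{1}{2} \left\{ \|P_m x\|^2 - 2 [P_m x, \omega] \right\} \right) d\mu_G \right) d\mu_s = 1 $$

Next

$$ E\left( |g(P_m\omega) - g_m(\omega)| \right) = \int_W \int_W \left( 1 - \exp\left( -\tfrac{1}{2} \|x - P_m x\|^2 \right) \right) \exp\left( -\tfrac{1}{2} \left\{ \|P_m x\|^2 - 2 [P_m x, \omega] \right\} \right) d\mu_G \, d\mu_s $$

$$ = \int_W \left( 1 - \exp\left( -\tfrac{1}{2} \|x - P_m x\|^2 \right) \right) d\mu_s \tag{3.5} $$

$$ < \epsilon \quad \text{for all } m > m(\epsilon). $$

Hence the convergence properties of $\{g(P_m\omega)\}$ are the same as those of $\{g_m(\omega)\}$.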

The latter sequence is a martingale. At this point rather than repeat
traditional arguments, we shall exploit them and thereby also show the
connection to the Wiener process version. Thus let

$$ [y(\omega_2), \phi_i] = y_i = x_i + \zeta_i $$

where

$$ x_i = [x, \phi_i], \qquad \zeta_i = [\omega, \phi_i]. $$

Here the $\zeta_i$, $i = 1, \ldots, n$, for any finite n are independent zero-mean, unit-variance
Gaussians. We can create a "probability space" with a countably
additive measure on it such that for any finite number of co-ordinates we
have the same distributions: namely $R^\infty$ for the space, and the sigma-algebra
$\mathcal{B}$ generated by cylinder sets, for the Borel sets. Equivalently, we could
use $C[0,T]$, the Banach space of continuous functions with range in $R^n$ (with
the usual sup norm), as the space by defining the mapping of W into $C[0,T]$ by:

$$ S(t) = \int_0^t x(\sigma) \, d\sigma, \qquad 0 \le t \le T $$

and $W(t)$ to be the standard Wiener process on $C[0,T]$, and defining

$$ Y(t) = \int_0^t x(\sigma) \, d\sigma + W(t) \tag{3.6} $$

with the Wiener measure and the measure induced by $S(\cdot)$ independent. In this
way we get a "co-ordinate free" representation, and we note that the variables

$$ \int_0^T [\phi_i(t), dY(t)] = y_i $$

have the same finite dimensional distributions as before. Moreover the
variables $g_m(\omega)$ have a corresponding interpretation and have the same
distribution for any finite m, and under the condition (3.3), we know that
the measure induced by $Y(\cdot)$ is absolutely continuous with respect to Wiener
measure, the martingale sequence converging to it in the mean of order one.
The derivative itself is given (see Duncan [7]) by:

$$ E_x\left[ \exp\left( -\tfrac{1}{2} \left\{ \int_0^T \|x(t)\|^2 \, dt - 2 \int_0^T [x(t), dW(t)] \right\} \right) \right] \tag{3.7} $$

where $E_x[\ ]$ denotes expectation with respect to the measure induced by the
process $S(\cdot)$ on $C[0,T]$; the (Ito) integral (the processes being independent),

$$ \int_0^T [x(t), dW(t)] $$

being the same as:

$$ \sum_i x_i \zeta_i \quad \text{where} \quad x_i = \int_0^T [\phi_i(t), dS(t)], \qquad \zeta_i = \int_0^T [\phi_i(t), dW(t)]. $$


We have thus proved that $g(P_m\omega)$ is Cauchy in the mean of order one,
and such sequences are equivalent as we change basis. Moreover, it readily
follows that for cylinder sets C:

$$ \mu_y(C) = \lim_m \int_C g_m(\omega) \, d\mu_G = \lim_m \int_C g(P_m\omega) \, d\mu_G $$

This concludes the proof of the Theorem.

Corollary

For any t, $0 \le t \le T$, let

$$ W(t) = L_2[[0, t]; R^n] $$

$$ W_s(t) = L_2[[0, t]; H_s] $$

Let $\mu_G^t$ denote the Gauss measure on $W(t)$ and $\mu_s^t$ similarly the projection of
$\mu_s$ on the sub-sigma algebra of Borel sets in $W(t)$. Then the statement of
the theorem applied to measures on $W(t)$ reads:

$$ g(t; P(t)\omega) = \int_W \exp\left( -\tfrac{1}{2} \left\{ \|P(t)x\|^2 - 2 [P(t)x, \omega] \right\} \right) d\mu_s $$

where $P(t)$ denotes the projection of W on $W(t)$.

Proof  The proof is immediate. We state the corollary rather to note that we cannot take
derivatives (with respect to t) in this formula as we can in the Wiener
process version.

Remark  The Theorem holds for any countably additive measure $\mu_s$ on the Borel
sets of W, not necessarily generated by a random variable $f(\omega_s)$.

Let us note that the main virtue of the theorem is not so much the
formula (3.4) but rather that the derivative is a random variable. The latter
has been proved for a related but more general problem in [10] under
additional assumptions. We explore this in the next section.

The 'Linear' Case.

Mostly to illustrate the ideas involved, let us consider the special
case where $f(\omega_s)$ is linear. Thus let

$$ y(\omega_2) = L\omega_s + \omega \tag{3.8} $$

where now we allow H in the definition of W to be infinite dimensional, and
where L is a bounded linear transformation on $W_s$ into W. Then we note that
in order for $L\omega_s$ to be a random variable it is necessary and sufficient that
L be Hilbert-Schmidt. Hence let L be Hilbert-Schmidt. Then $y(\omega_2)$ being
Gaussian, it is completely characterized by the corresponding covariance
operator:

$$ I + LL^* $$

Since $LL^*$ is certainly Hilbert-Schmidt (actually of course trace-class), we
can apply the Krein factorization theorem to obtain the representation

$$ (I + LL^*)^{-1} = (I - \mathcal{L})^* (I - \mathcal{L}) $$

where $\mathcal{L}$ is a Hilbert-Schmidt Volterra operator:

$$ \mathcal{L} f = g; \qquad g(t) = \int_0^t k(t, s) f(s) \, ds \quad \text{a.e.}, \ 0 < t < T $$

mapping W into itself. In particular we note that

$$ z(\omega_2) = y(\omega_2) - \mathcal{L} y(\omega_2) $$

also defines white noise; and defining

$$ (I + M) = (I - \mathcal{L})^{-1} $$

where M must then be also Hilbert-Schmidt and Volterra, we note that we can
represent $y(\omega_2)$ also as:

$$ y(\omega_2) = M z(\omega_2) + z(\omega_2) \tag{3.9} $$

In this form we can seek the derivative of the weak distribution induced
by $y(\cdot)$ with respect to Gauss measure (induced by $z(\omega_2)$), but the processes are no longer
independent. However it is shown in [10] that the derivative is a random
variable if and only if

$$ M + M^* $$

is trace-class. But in the present instance this readily follows from the
fact that $LL^*$ is trace-class, since

$$ LL^* = MM^* + (M + M^*) $$

In other words, in the model (3.9), the condition that $M + M^*$ be trace-class is
always satisfied if it is deduced from the model (3.1). Incidentally, it is
of interest to note that the derivative is given by:

$$ g(\omega) = \exp\left( -\tfrac{1}{2} \left[ \|M\omega\|^2 - 2 [M\omega, \omega] + \mathrm{Tr}(M + M^*) \right] \right) \tag{3.10} $$

and can be deduced from (3.4). Also it should be noted that

$$ \mathrm{Tr}(M + M^*) = \mathrm{Tr}(\mathcal{L} + \mathcal{L}^*) $$

and also

$$ \mathrm{Tr}(M + M^*) = E\left[ \|x - \mathcal{L} y\|^2 \right]; \qquad x = L\omega_s \tag{3.11} $$

The last formula is particularly interesting since it has a variational
interpretation. Since $L\omega_s$ is such that the covariance $LL^*$ is trace-class,
we can formulate the problem of minimizing

$$ E\left[ \|L\omega_s - K y(\omega_2)\|^2 \right] \tag{3.12} $$

over the class of all Hilbert-Schmidt Volterra operators K. But to
show that a minimum exists and is given by the H.S. Volterra operator $K_0$
it is enough to show that

$$ \frac{d}{d\lambda} E\left[ \|L\omega_s - (K_0 + \lambda K) y(\omega_2)\|^2 \right] \Big|_{\lambda = 0} = 0 $$

or

$$ \frac{d}{d\lambda} \sum_i E\left[ [L\omega_s - (K_0 + \lambda K) y(\omega_2), \phi_i]^2 \right] \Big|_{\lambda = 0} = 0 $$

or

$$ \frac{d}{d\lambda} \mathrm{Tr} \left\{ (K_0 + \lambda K)(I + LL^*)(K_0 + \lambda K)^* - 2 LL^* (K_0 + \lambda K)^* \right\} \Big|_{\lambda = 0} = 0 $$

which yields

$$ \mathrm{Tr} \left( K_0 (I + LL^*) - LL^* \right) K^* = 0 $$

for which it is necessary and sufficient that

$$ K_0 (I + LL^*) - LL^* $$

be the adjoint of a H.S. Volterra operator. But substituting $\mathcal{L}$ for $K_0$, we
see that

$$ \mathcal{L}(I + LL^*) - LL^* = (\mathcal{L} - I)(I + LL^*) + I = -(I - \mathcal{L}^*)^{-1} + I = -(I + M^*) + I $$

Hence $\mathcal{L}$ yields the optimal minimising H.S. Volterra operator. The main
point to be noted here is that the existence of an optimal solution to the
minimising problem (3.12) is equivalent to that of the Krein factorization.
Whether L is Volterra or not plays no role.

Conditional Expectation: Bayes Formula

Let us note now one important by-product of Theorem 3.1. Let $\phi(\cdot)$
be any element of W. Then by

$$ E\left[ [f(\omega_s), \phi] \mid y(\omega_2) \right] \tag{3.13} $$

we shall mean the limit of the Cauchy sequence (in the mean of order two):

$$ E\left[ [f(\omega_s), \phi] \mid P_n y(\omega_2) \right] \tag{3.14} $$

where $P_n$ is a sequence of monotone increasing finite dimensional
projections converging strongly to the identity. It is implicit that
this limit is independent of the particular sequence $P_n$ chosen. We can
then state (Bayes' Formula):


Theorem 4.1

$$ E\left[ [f(\omega_s), \phi] \mid y(\omega_2) \right] = \frac{\int_W [S, \phi] \exp\left( -\tfrac{1}{2} \left\{ \|S\|^2 - 2 [S, y(\omega_2)] \right\} \right) d\mu_s}{\int_W \exp\left( -\tfrac{1}{2} \left\{ \|S\|^2 - 2 [S, y(\omega_2)] \right\} \right) d\mu_s} \tag{3.15} $$


Remark  Note that (3.15) is defined for every y in W.

Proof

Given the monotone sequence of finite dimensional projections $\{P_n\}$,
we may consider an orthonormal basis $\{\phi_i\}$ for W such that $P_n$ corresponds
to the space spanned by the first n. Then we can calculate

$$ E\left[ [f(\omega_s), \phi_i] \mid P_n y(\omega_2) \right] $$

by the (finite-dimensional) Bayes rule:

$$ \frac{\int_W [S, \phi_i] \exp\left( -\tfrac{1}{2} \left\{ \|P_n S\|^2 - 2 [P_n S, y] \right\} \right) d\mu_s}{\int_W \exp\left( -\tfrac{1}{2} \left\{ \|P_n S\|^2 - 2 [P_n S, y] \right\} \right) d\mu_s} $$

and obtain in the limit the formula (3.15) with $\phi_i$ for $\phi$. The formula
for arbitrary $\phi$ is then immediate therefrom.
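The finite-dimensional Bayes rule used in the proof can be illustrated in the scalar case. The Python sketch below assumes, for illustration only, a signal taking the values +1 and -1 with equal probability, observed in unit-variance Gaussian noise. The ratio of integrals in (3.15) then reduces to a ratio of finite sums, and the conditional mean should equal $\tanh(y)$. The function name `conditional_mean` is hypothetical.

```python
import math


def conditional_mean(y: float, atoms: list, probs: list) -> float:
    """Scalar Bayes rule in the form of (3.15): the ratio of
    sum s * exp(-1/2 (s^2 - 2 s y)) p(s) to sum exp(-1/2 (s^2 - 2 s y)) p(s),
    where the signal law puts mass probs[i] on atoms[i]."""
    weights = [p * math.exp(-0.5 * (s * s - 2.0 * s * y)) for s, p in zip(atoms, probs)]
    numerator = sum(s * wgt for s, wgt in zip(atoms, weights))
    return numerator / sum(weights)


if __name__ == "__main__":
    for y in (-1.3, 0.0, 0.4, 2.0):
        print(y, conditional_mean(y, [1.0, -1.0], [0.5, 0.5]), math.tanh(y))
```

Note that the expression is defined for every value of y, in line with the Remark following the theorem.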

Corollary

Let $P(t)$ denote the projection of W onto $W(t)$. Then for any $\phi$ in W,
and $0 \le t \le T$,

$$ E\left[ [P(t) f(\omega_s), P(t)\phi] \mid P(t) y(\omega_2) \right] = \frac{\int_W [P(t)S, P(t)\phi] \exp\left( -\tfrac{1}{2} \left\{ \|P(t)S\|^2 - 2 [P(t)S, P(t)y] \right\} \right) d\mu_s}{\int_W \exp\left( -\tfrac{1}{2} \left\{ \|P(t)S\|^2 - 2 [P(t)S, P(t)y] \right\} \right) d\mu_s} \tag{3.16} $$

Proof  The proof is immediate.


Likelihood  Ratio:  General  Case 

Let  us  new  consider  the  general  case  where  the  signal  process  is  not 
necessarily  Gaussian.  Let 


y(t)  = S(t)  + N(t)  0 < t < T < <« 

where  S(-)  and  N(-)  are  independent  processes.  We  shall  assume  that  the  signal 
S(*)  has  finite  energy:  (corresponding  to  3.3) 

rT  ? 

I E( | ) S(t) | | ;dt  < “ 

*0 

For  each  t,  0 < t < T,  let 


W(t)  = L2[Rn;  (0,t)] 

We  shall  shorten  W(T)  to  simply  W.  Under  condition  (3.3),  the  process  S( •)  induces 
a countably  additive  measure  on  W (and  hence  on  W(t)  for  eadi  t).  [The  cylinder 
measure  on  W can  be  extended  to  be  countably  additive,  in  other  words;  this  is  a 
consequence  of  the  Sazonov  theorem].  Ihus  y(.)  defines  a weak  distribution  on 
W defined  by  the  characteristic  function: 

E[e  Cy’h]]  = Cg(h)  Exp  - 1/2  ||h||2  (3.17) 

where 


C (h)  = K[e3CS’h]] 

S 


where  we  Liave  used  the  inner1- pi oduct  notation: 


CS(t),  h(t)dt,  h e W. 


Then the cylinder measure induced by y(·) is absolutely continuous with respect
to Gauss measure $\mu_G$ and the Radon-Nikodym derivative is defined by the function:

\[ f(\omega) = \int_W \exp\!\big(-\tfrac12 \{\|S\|^2 - 2[S,\omega]\}\big)\, d\mu_S \tag{3.18} \]

Thus for any cylinder set C,

\[ \mu_y(C) = \lim_{n \to \infty} \int_C f(P_n \omega)\, d\mu_G \]

where $P_n$ is any sequence of finite dimensional projections strongly convergent
to the identity.

Let $\{\phi_n\}$ be an orthonormal basis in W and let L denote the mapping of W
into $\ell_2$:

\[ Lx = a; \qquad a_n = \int_0^T [x(\sigma), \phi_n(\sigma)]\, d\sigma. \]

Let

\[ LS = \zeta \]

Let $\mu_\zeta$ denote the measure induced on $\ell_2$ by this mapping. Then we can rewrite
(3.18) in the form

\[ f(\omega) = \int_{\ell_2} \exp\!\big(-\tfrac12 \{\|\zeta\|^2 - 2[\zeta, L\omega]\}\big)\, d\mu_\zeta \tag{3.19} \]

It must be emphasised that (2.6) is defined for every element $\omega$ in W. Note also
that (3.19) can be defined with respect to any orthonormal system $\{\phi_n\}$.
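
The passage from (3.18) to (3.19) rests on L preserving norms and inner products. A minimal numerical sketch of the mapping L follows, assuming (purely for illustration) a scalar signal, a cosine basis on (0,T), and midpoint quadrature; the names `cosine_basis` and `coefficients` are hypothetical.

```python
import numpy as np

def cosine_basis(n_funcs, t, T):
    """First n_funcs functions of an orthonormal cosine basis of L2(0, T),
    sampled at the points t (an assumed choice of basis)."""
    k = np.arange(n_funcs)[:, None]
    phi = np.sqrt(2.0 / T) * np.cos(k * np.pi * t[None, :] / T)
    phi[0] = 1.0 / np.sqrt(T)  # the constant function, normalised
    return phi

def coefficients(x, phi, dt):
    """a_n = integral of x(s) * phi_n(s) ds, by midpoint quadrature."""
    return (phi @ x) * dt

# Midpoint grid on (0, T)
T, N = 2.0, 400
dt = T / N
t = (np.arange(N) + 0.5) * dt
phi = cosine_basis(8, t, T)

# A signal lying in the span of the basis: S = 2*phi_1 - 3*phi_4
S = 2.0 * phi[1] - 3.0 * phi[4]
zeta = coefficients(S, phi, dt)   # plays the role of LS
energy = np.sum(S**2) * dt        # ||S||^2, to compare with sum(zeta**2)
```

On this grid the sampled cosines are orthonormal under the quadrature rule, so `zeta` recovers the coefficients and `energy` agrees with the sum of squares of `zeta`.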

Let us next consider the likelihood functional f(y) where y(·) is the
observation. For this purpose, let (3.19) be defined with respect to the
orthonormal system $\{\phi_i\}$. For each t, $0 < t \le T$, define the operators A(t),
mapping W into $\ell_2$, by:

\[ A(t)x = a; \qquad a_n = \int_0^t [\phi_n(\sigma), x(\sigma)]\, d\sigma \]


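Let

\[ R(t) = A(t)\, A(t)^*. \]

Then the Radon-Nikodym derivative of the measure induced by the process y(·)
over [0,t] with respect to Gauss measure on W(t) is given by:

\[ f(t,\omega) = \int_{\ell_2} \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)\omega]\}\big)\, d\mu_\zeta \tag{3.20} \]

Note that A(T) = L. Let $P_n$ denote the projection operator corresponding to the
first n basis functions $\{\phi_i\}$, i = 1, ..., n. Then we define

\[ \hat\zeta(t) = \lim_n E[\zeta \mid A(t) P_n y] \]

As we have seen, we have (Bayes Formula) that

\[ \hat\zeta(t) = \frac{\int_{\ell_2} \zeta\, \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)y]\}\big)\, d\mu_\zeta}
                        {\int_{\ell_2} \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)y]\}\big)\, d\mu_\zeta} \]

Note that, by the Schwarz inequality,

\[ \|\hat\zeta(t)\|^2
   \le \frac{\int_{\ell_2} \|\zeta\|^2 \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)y]\}\big)\, d\mu_\zeta}
            {\int_{\ell_2} \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)y]\}\big)\, d\mu_\zeta}
   = \frac{\int_{\ell_2} \|\zeta\|^2 \exp\!\big(-\tfrac12 \|R(t)\zeta - A(t)y\|^2\big)\, d\mu_\zeta}
          {\int_{\ell_2} \exp\!\big(-\tfrac12 \|R(t)\zeta - A(t)y\|^2\big)\, d\mu_\zeta} \]

\[ \le c\, E[\|\zeta\|^2]\, \exp\!\big(+\tfrac12 (\|A(t)y\| + k)^2\big), \qquad 0 < c < \infty,\ 0 < k < \infty \tag{3.21} \]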


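It should be noted that such an estimate is not available in the Wiener process
version. Moreover we shall show that (3.20) is actually absolutely continuous in
t with an $L_1$-derivative. Let $\varphi(t)$ be infinitely differentiable with compact
support in (0,T). Then

\[ \int_0^T f(t,\omega)\, \varphi'(t)\, dt
   = \int_{\ell_2} \Big[ \int_0^T \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)\omega]\}\big)\, \varphi'(t)\, dt \Big]\, d\mu_\zeta \]

\[ = -\int_{\ell_2} \Big[ \int_0^T \Big( -\tfrac12 \big\|\textstyle\sum_i \phi_i(t)\zeta_i\big\|^2 + \big[\textstyle\sum_i \phi_i(t)\zeta_i,\ \omega(t)\big] \Big)
     \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)\omega]\}\big)\, \varphi(t)\, dt \Big]\, d\mu_\zeta \]

where we note that both

\[ \big\|\textstyle\sum_i \phi_i(t)\zeta_i\big\|^2 \quad\text{and}\quad \big[\textstyle\sum_i \phi_i(t)\zeta_i,\ \omega(t)\big] \]

are in $L_1[0,T]$ for each $\zeta$ in $\ell_2$. Hence the derivative is (defined a.e., 0 < t < T):

\[ \int_{\ell_2} \Big( -\tfrac12 \big\|\textstyle\sum_i \phi_i(t)\zeta_i\big\|^2 + \big[\textstyle\sum_i \phi_i(t)\zeta_i,\ \omega(t)\big] \Big)
   \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)\omega]\}\big)\, d\mu_\zeta \]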

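We shall next prove that

\[ \hat S_N(t) = \sum_{i=1}^{N} \phi_i(t)\, \hat\zeta_i(t), \qquad 0 < t < T \]

converges in the norm of W. But this is immediate from the fact that, analogous
to (3.21):

\[ \|\hat S_N(t)\|^2 \le c\, E\Big[\big\|\textstyle\sum_{i=1}^{N} \phi_i(t)\zeta_i\big\|^2\Big]\, \exp\!\big(+\tfrac12 \|A(t)y\|^2\big) \qquad \text{a.e. } 0 < t < T \]

Let

\[ \hat S(t) = \sum_{i=1}^{\infty} \phi_i(t)\, \hat\zeta_i(t) \]

and

\[ \widehat{\|S(t)\|^2} = \frac{\int_{\ell_2} \big\|\sum_i \phi_i(t)\zeta_i\big\|^2 \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)y]\}\big)\, d\mu_\zeta}
                                {\int_{\ell_2} \exp\!\big(-\tfrac12 \{[R(t)\zeta,\zeta] - 2[\zeta, A(t)y]\}\big)\, d\mu_\zeta} \]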


Then from (2.13) we can write:

\[ \frac{d}{dt} \log f(t,y) = -\tfrac12 \Big\{ \|\hat S(t)\|^2 - 2[\hat S(t), y(t)] + \widehat{\|S(t)\|^2} - \|\hat S(t)\|^2 \Big\} \]

and hence finally, for the log likelihood functional:

\[ \log f(y) = -\tfrac12 \Big\{ \int_0^T \|\hat S(t)\|^2\, dt - 2 \int_0^T [\hat S(t), y(t)]\, dt
   + \int_0^T \big[ \widehat{\|S(t)\|^2} - \|\hat S(t)\|^2 \big]\, dt \Big\} \tag{3.22} \]


We note that the third term can also be expressed as

\[ \lim_{n \to \infty} E\big[\, \|S(t) - \hat S(t)\|^2 \;\big|\; A(t) P_n y \,\big] \]

The formula (3.22) differs from the Wiener process version in the appearance
of the third term; in the case where S(t) is Gaussian, we know that this reduces
to

\[ E\big[\, \|S(t) - \hat S(t)\|^2 \,\big] \]

which is then also independent of the observation y(·), as we have already seen.
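
Formula (3.22) can be checked numerically in a simple setting. The sketch below assumes (for illustration only) a scalar signal drawn from a finite set of candidate paths with known prior weights, and a smooth observation path sampled on a uniform grid. It compares log f computed directly from the defining integral with the right-hand side of (3.22). All names are hypothetical.

```python
import numpy as np

def cumtrapz(g, dt):
    """Cumulative trapezoid integral along the last axis, starting at 0."""
    out = np.zeros_like(g)
    out[..., 1:] = np.cumsum(0.5 * (g[..., 1:] + g[..., :-1]) * dt, axis=-1)
    return out

def log_likelihood_two_ways(signals, weights, y, dt):
    """Return (direct, via_formula) for a finite prior on scalar signal paths.

    signals: (K, N) candidate paths S_k(t); weights: (K,) prior; y: (N,) data.
    """
    energy = cumtrapz(signals**2, dt)        # integral of S_k^2 up to t
    corr = cumtrapz(signals * y, dt)         # integral of S_k * y up to t
    lik = weights[:, None] * np.exp(-0.5 * (energy - 2.0 * corr))
    f_t = lik.sum(axis=0)                    # f(t, y) for every t
    post = lik / f_t                         # posterior weights given data on [0, t]
    s_hat = (post * signals).sum(axis=0)     # conditional mean of S(t)
    s2_hat = (post * signals**2).sum(axis=0) # conditional mean of S(t)^2
    integrand = -0.5 * (s_hat**2 - 2.0 * s_hat * y + (s2_hat - s_hat**2))
    via_formula = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dt)
    direct = np.log(f_t[-1] / f_t[0])
    return direct, via_formula
```

The two values agree up to quadrature error, and the third term of the integrand is the conditional variance of S(t), which is nonnegative.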


4. Dynamic Systems


Finite Dimensional Case

We wish now to specialise our results to the case where $S(\theta,t)$ has
a stochastic differential system representation:

\[ S(\theta,t) = C(\theta)\, x(\theta,t) \]
\[ \dot x(\theta,t) = A(\theta)\, x(\theta,t) + F(\theta)\, \omega(t); \qquad x(\theta,0) = 0 \tag{4.1} \]

and the observation process has the form:

\[ y(\theta,t) = S(\theta,t) + G\, \omega(t) \tag{4.2} \]

where we shall first consider the finite dimensional case so that $C(\theta)$,
$A(\theta)$, $F(\theta)$, G are all rectangular matrices with

\[ A(\theta): m \times m, \qquad F(\theta): m \times n, \]
\[ F(\theta)\, G^* = 0, \]
\[ G G^* = \text{Identity matrix}. \]

We take $\omega(\cdot)$ as sample functions of white noise in

\[ W = L_2[(0,T); R^n]. \]

Now equation (4.1) for each fixed $\theta$ has (see [10]) the unique solution

\[ x(\theta,t) = \int_0^t e^{A(\theta)(t-s)}\, F(\theta)\, \omega(s)\, ds, \qquad 0 < t < T \]

and

\[ x(\theta,t) = L\omega \]

defines a Hilbert-Schmidt operator on W into
we use the iteration (modified Newton-Raphson):


Vl  = °n  - M(t> 


n'  '6h'  n’ 


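where $M(\theta;T)$ is the matrix with components:

\[ m_{ij}(\theta) = \int_0^T \Big[ \frac{\partial}{\partial \alpha_i} S(\theta,t),\ \frac{\partial}{\partial \alpha_j} S(\theta,t) \Big]\, dt \]

where $\{\alpha_i\}$ denote the 'components' of $\theta$. We assume that $M(\theta;T)$ is
positive definite on the set of admissible parameters $\theta$.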
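
To illustrate the role of the matrix M, here is a minimal sketch of an iteration of this type in the simplest setting: a deterministic scalar signal fitted by least squares, where M is built from the sensitivities of the signal. This is a Gauss-Newton-style stand-in under stated assumptions, not the paper's likelihood-based algorithm; the function `modified_newton`, the forward-difference sensitivities, and the exponential test signal are all hypothetical choices.

```python
import numpy as np

def modified_newton(signal, theta0, y, t, n_iter=20, eps=1e-6):
    """Iterate theta <- theta - M(theta)^{-1} grad q(theta), where
    q(theta) = 1/2 * integral of (y - S(theta, t))^2 dt   (assumed criterion)
    and m_ij = integral of dS/d(alpha_i) * dS/d(alpha_j) dt.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    dt = t[1] - t[0]
    for _ in range(n_iter):
        s = signal(theta, t)
        # Sensitivities dS/d(alpha_i), approximated by forward differences
        J = np.stack([(signal(theta + eps * e, t) - s) / eps
                      for e in np.eye(theta.size)])
        M = (J @ J.T) * dt               # matrix with components m_ij
        grad = -(J @ (y - s)) * dt       # gradient of q with respect to theta
        theta = theta - np.linalg.solve(M, grad)
    return theta
```

Replacing the exact second-derivative matrix by M keeps the step well defined whenever M is positive definite, which is the assumption made in the text.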

Infinite-Dimensional Case

The extension of (4.1) to the infinite dimensional case (corresponding
to partial differential equations) can take many forms. One version is
treated in [10]. For each $\theta$, $A(\theta)$ in (4.1) is now the infinitesimal
generator of a strongly continuous semigroup over a separable Hilbert space
$H_s$. Equation (4.1) remains formally the same, with $F(\theta)$ being a linear
bounded transformation for each $\theta$, mapping H into $H_s$, $\omega(\cdot)$ denoting white
noise in

\[ W = L_2[(0,T); H] \]

H being a separable Hilbert space.

Similarly $C(\theta)$ is assumed to be linear bounded and G linear bounded with

\[ F(\theta)\, G^* = 0; \qquad G G^* = \text{Identity} \]

In that case the finite dimensional version (4.4) goes over without change
provided we assume that

\[ \int_0^T E\big( \|C(\theta)\, x(\theta;t)\|^2 \big)\, dt < \infty \]
This in particular implies that

\[ C(\theta)\, P(\theta,t)\, C(\theta)^* \]

is trace-class a.e. and that

\[ \int_0^T \mathrm{Tr.}\ C(\theta)\, P(\theta,t)\, C(\theta)^*\, dt < \infty \]

However, from the practical point of view we need to consider the case
where $C(\theta)$ is allowed to be unbounded, and uncloseable (corresponding
to 'boundary' or 'pointwise' observations in distributed parameter systems).

Here we shall consider such an extension that takes care of the
application to the case of turbulence with non-rational spectrum (see Sec. 6).
Actually the model we shall study represents a wide variety of situations,
assuming only linearity. Thus we take:


\[ y(t) = S(t,\theta) + n_1(t), \qquad 0 < t < T \tag{4.5} \]

where $n_1(\cdot)$ is white noise in

\[ W_0 = L_2((0,T); R^p) \]

and $S(t,\theta)$ has the form

\[ S(t,\theta) = \int_0^t B(\theta; t-s)\, u(s)\, ds + \int_0^t F(\theta; t-s)\, n_2(s)\, ds \tag{4.6} \]

where $n_2(\cdot)$ is white noise (independent of $n_1(\cdot)$) in

\[ W_s = L_2((0,T); R^n), \]

$u(\cdot)$ is a known (deterministic) function with

\[ \int_0^T \|u(t)\|^2\, dt < \infty \]

and for each $\theta$:

\[ \int_0^\infty \|B(\theta;\sigma)\|^2\, d\sigma + \int_0^\infty \|F(\theta;\sigma)\|^2\, d\sigma < \infty \tag{4.7} \]


Note that (4.1), (4.2) form a special case of (4.5), (4.6), (4.7),
where the Laplace transforms of $B(\theta,\sigma)$ and $F(\theta,\sigma)$ are constrained to be
rational functions. To handle the generalization when one (or both) is
not necessarily rational, we proceed (see [12]) as follows. We show that
we can rewrite (4.7) in terms of a partial differential equation representation.
Thus let

\[ H = L_2[0, \infty; R^p] \]

where p is the dimension of the observation. Let A denote the generator
of the shift semigroup over H:

\[ D(A) = [\, f \in H \mid f(\cdot) \text{ is absolutely continuous and the derivative } f'(\cdot) \in H \,] \]

and

\[ Af = f' \]

Let u(t) be an m x 1 matrix function. Let $B(\theta)$ denote a linear bounded
operator mapping $R^m$ into H defined by:

\[ B(\theta)u = g; \qquad g(t) = B(\theta,t)\,u, \quad 0 < t < \infty, \quad u \in R^m \]

Let
so that $\omega(t)$ is white noise in

\[ L_2((0,T); R^p \times R^n) \]

Let $F(\theta)$ denote the linear bounded operator mapping $R^p \times R^n$ into H defined
by

\[ F(\theta)v = g; \qquad g(t) = F(\theta;t)\, n_2, \qquad v = \begin{pmatrix} n_1 \\ n_2 \end{pmatrix}, \quad n_1 \in R^p,\ n_2 \in R^n \]

Finally let G denote the mapping of $R^p \times R^n$ into $R^p$ defined by

\[ Gv = w, \qquad w = n_1, \qquad v = \begin{pmatrix} n_1 \\ n_2 \end{pmatrix}, \quad n_1 \in R^p,\ n_2 \in R^n. \]

Then we claim that (4.6) is representable as:

\[ \dot x(\theta,t) = A\, x(\theta,t) + B(\theta)\, u(t) + F(\theta)\, \omega(t); \qquad x(\theta,0) = 0 \]
\[ y(t) = C\, x(\theta,t) + G\, \omega(t) \tag{4.8} \]


where it should be noted that

\[ F(\theta)\, G^* = 0; \qquad G G^* = I \]

and where C is the operator defined by

\[ \text{Domain of } C = [\, f \in H \mid f(\cdot) \text{ is continuous in } 0 \le t < \infty \,] \]
\[ Cf = f(0) \]

We assume that $B(\theta,t)$, $F(\theta,t)$ are locally continuous in $0 \le t < \infty$. We
can readily see that $x(\theta,t)$ is then in the domain of C for each t. That
(4.8) is the same as (4.7) follows from the representation:

\[ x(\theta,t) = \int_0^t S(t-\sigma)\, B(\theta)\, u(\sigma)\, d\sigma + \int_0^t S(t-\sigma)\, F(\theta)\, \omega(\sigma)\, d\sigma \]

where S(t) is the semigroup (shift) generated by A. Even though C is not
closeable, $C\,x(\theta,t)$ is defined and is locally continuous in $0 \le t < \infty$,
for each $u(\cdot)$ and $\omega(\cdot)$. We can then (see [12]) deduce the analogue of
(4.3a), (4.3b) as:

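\[ \dot{\hat x}(\theta,t) = A\, \hat x(\theta,t) + (C\,P(\theta,t))^*\, [\, y(t) - C\, \hat x(\theta,t) \,]; \qquad \hat x(\theta,0) = 0 \tag{4.9} \]

where $P(\theta,t)$ satisfies:

\[ \frac{d}{dt} [P(\theta,t)x, y] = [P(\theta,t)x, A^* y] + [P(\theta,t)y, A^* x] + [F(\theta)^* x, F(\theta)^* y] - [C P(\theta,t)x, C P(\theta,t)y]; \]
\[ P(\theta,0) = 0, \qquad x, y \in \text{Domain of } A^*. \tag{4.10} \]

In particular $P(\theta,t)$ maps into the domain of C, and

\[ C\,P(\theta,t) \]

is linear bounded (even though C is not closed; see [12]) for each t.
Moreover (cf. [12]):

\[ (C\,P(\theta,t))^* \subset \text{Domain of } C \]

and

\[ C\,(C\,P(\theta,t))^* \]

is bounded (and automatically trace-class, being finite dimensional).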
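
The shift-semigroup representation of the convolution (4.6) can be illustrated in discrete time. In the sketch below the state is a sampled function of a delay variable; the semigroup acts as a left shift, and the point-evaluation operator C reads off the first sample. The grid, the explicit one-step update, and the name `shift_semigroup_output` are assumptions made for illustration; only the deterministic input term is simulated.

```python
import numpy as np

def shift_semigroup_output(b, u, h):
    """Discrete sketch of x' = A x + B u, output C x = x(t, 0), with A the
    generator of the left shift.

    b: samples of the kernel B(theta, k*h), with len(b) >= len(u)
    u: samples of the scalar input u(k*h)
    h: grid spacing
    """
    x = np.zeros(len(b))           # state: a function of the delay variable
    out = np.empty(len(u))
    for n, u_n in enumerate(u):
        # One time step: shift the state left, then inject the input
        x = np.concatenate([x[1:], [0.0]]) + h * b * u_n
        out[n] = x[0]              # pointwise observation C x = x(t, 0)
    return out

h, N = 0.01, 200
t = np.arange(N) * h
b = t ** (5.0 / 6.0) * np.exp(-t)  # an example non-rational kernel (assumed)
u = np.sin(3.0 * t)
out = shift_semigroup_output(b, u, h)
reference = h * np.convolve(b, u)[:N]   # direct discrete convolution
```

The output of the shift recursion coincides with the direct discrete convolution, which mirrors the claim that the state-space form reproduces the integral representation.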


The Radon-Nikodym derivative formula (4.4) now becomes

\[ \exp\Big( -\tfrac12 \Big\{ \int_0^T \|\hat S(\theta,t)\|^2\, dt - 2 \int_0^T [\hat S(\theta,t), y(t)]\, dt
   + \int_0^T \mathrm{Tr.}\ C\,(C\,P(\theta,t))^*\, dt \Big\} \Big) \tag{4.11} \]


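In this version it is important to note that the 'steady-state'
solution of (4.10) exists:

\[ P(\theta,\infty)x = \lim_{t \to \infty} P(\theta,t)x \]

\[ 0 = [P(\theta,\infty)x, A^* y] + [P(\theta,\infty)y, A^* x] + [F(\theta)^* x, F(\theta)^* y] - [C P(\theta,\infty)x, C P(\theta,\infty)y] \tag{4.12} \]

provided

\[ \iint \cdots \, d\sigma\, dt < \infty \tag{4.13} \]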
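
As a concrete finite-dimensional analogue of (4.9), (4.10), (4.11), the sketch below integrates the filter and covariance equations by the Euler method and accumulates the exponent of the likelihood formula. It assumes bounded matrices A, F, C with F G* = 0 and G G* = I, and takes the estimated signal to be C times the state estimate; the function name and the step scheme are illustrative choices, not the paper's algorithm.

```python
import numpy as np

def filter_log_likelihood(A, F, C, y, dt):
    """Euler integration of
        x_hat' = A x_hat + P C^T (y - C x_hat),
        P'     = A P + P A^T + F F^T - P C^T C P,
    returning (log-likelihood exponent, final x_hat, final P).

    y: (N, p) array of observation samples on a uniform grid of spacing dt.
    """
    m = A.shape[0]
    x_hat = np.zeros(m)
    P = np.zeros((m, m))
    total = 0.0
    for y_t in y:
        s_hat = C @ x_hat
        # Integrand: ||S_hat||^2 - 2 [S_hat, y] + Tr C P C^T
        total += (s_hat @ s_hat - 2.0 * (s_hat @ y_t) + np.trace(C @ P @ C.T)) * dt
        gain = P @ C.T
        x_hat = x_hat + dt * (A @ x_hat + gain @ (y_t - s_hat))
        P = P + dt * (A @ P + P @ A.T + F @ F.T - gain @ gain.T)
    return -0.5 * total, x_hat, P
```

For the scalar case A = -1, F = 1, C = 1, the covariance settles at the positive root of 0 = -2P + 1 - P^2, namely sqrt(2) - 1, which illustrates the steady-state equation (4.12) in the simplest instance.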


5. Application

We turn now to an application: estimation of stability and control
derivatives from flight test data. The dynamic system considered arises
from the longitudinal mode perturbation equations for an aircraft in
windgust (turbulence) (Rediess-Taylor, see [13]). We use the Dryden
version of the spectrum of turbulence, which is rational, so that the total
system is finite dimensional. Leaving the many essential details to the
comprehensive work of Iliff [11], the state space formulation of the problem
is as follows (see also [12]):

\[ \dot x(t) = A\, x(t) + B\, u(t) + F\, n_2(t) \]
\[ v(t) = C\, x(t) + D\, u(t) + G\, n_1(t) \]

where $n_1(\cdot)$ and $n_2(\cdot)$ are independent white Gaussian, and the matrices
in the equations have the form:
in  the  equations  have  the  form: 


g = acceleration  due  to  gravity 
G = diag.  L.OOUb,  .OUUl,  .01,  .0001] 
The lettered entries (stability and control derivatives) are unknown,
except for v, which is known (-1670). Note that the turbulence power is
an unknown parameter also.

The sampling interval was 0.02 seconds while the data bandwidth is
about 5 Hz. Figure 1 shows the complete time history of the observation
v(t) (four components) subdivided into various regions for later
identification, as well as the input time-history. Estimates were computed
over the various subregions, each by three methods:

Method I: Neglecting the measurement noise on the angle of attack
measurement (v^) and following the corresponding maximal
likelihood technique developed in [13]. This is reasonable
for this particular example at high turbulence levels.

Method II: This is the method developed herein.

Method III: This was a "check" method, in which the turbulence
was ignored completely in the model.

The results are summarized in Figure 2. Sample means and variances
of the estimates obtained over the different data-regions are shown, along
with the wind-tunnel values as well as estimates obtained on other
turbulence-free (smooth air) flights. It can be seen that Method II yields the most
consistent estimates. It also turns out that Method II requires the least
computational time, the estimates converging in fewer iterations. It
can also be seen that ignoring the turbulence leads to the worst results.
For more discussion see [11]. The remaining figures indicate the nature
of the "fit" obtained using the estimated coefficients to the observed
data. Figure 3 shows the close agreement provided by Method II. Figures 4
and 5 indicate how much worse the agreement is on the same stretch of
data if the turbulence is not accounted for.

If we use the non-rational (Kolmogorov) version of the spectrum of
turbulence, we have to use (4.11). In particular in this case,
$F(\theta,t)$ has the form

\[ F(\theta,t) = \big( a(\theta)\, t^{5/6} + b(\theta)\, t^{-1/6} \big)\, e^{\,\cdots} \]

corresponding to the spectral density of the form (cf. [15]):

\[ \frac{1 + c f^2}{(1 + d f^2)^{11/6}} \]

The possibility of using flight-test data to distinguish between the
two models of the spectral density is an intriguing one at the present
time.
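
The non-rational character of this spectral density shows up in its high-frequency roll-off. The short sketch below evaluates the stated form and estimates its log-log slope, which tends to 2 - 11/3 = -5/3. The constants c and d are placeholders chosen for illustration, not values from the flight data, and the function names are hypothetical.

```python
import numpy as np

def spectral_shape(f, c, d):
    """Spectral density of the form (1 + c f^2) / (1 + d f^2)^(11/6)."""
    f = np.asarray(f, dtype=float)
    return (1.0 + c * f**2) / (1.0 + d * f**2) ** (11.0 / 6.0)

def loglog_slope(f1, f2, c, d):
    """Average slope of log(density) against log(frequency) between f1 and f2."""
    num = np.log(spectral_shape(f2, c, d)) - np.log(spectral_shape(f1, c, d))
    return num / (np.log(f2) - np.log(f1))

# Placeholder constants (assumed); slope measured far above the corner frequency
slope = loglog_slope(1e3, 1e4, c=1.0, d=1.0)
```

A fractional asymptotic slope of this kind cannot be produced by a ratio of polynomials in f^2, which is why the rational finite-dimensional model no longer applies.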